Graph-structured data classification based on spectral methods and the generalized likelihood ratio test

An application to Alzheimer's disease diagnosis from PET image data


1. Introduction

Many real-world datasets are not well represented in Euclidean space because they naturally lie on irregular structures such as graphs (e.g., molecules, social networks, or sensor networks). In particular, brain imaging data often exhibit structured relationships between regions. In this project, instead of treating features independently, I model the data for each subject as a signal on a graph, where nodes represent anatomical brain regions and edge weights encode connectivity between regions across subjects. The classification task is formulated as a hypothesis test between two representative graphs (healthy group versus diseased group), using spectral properties of graph signals. This approach builds on the framework introduced by Hu et al. [1] and is applied here for the first time to multi-dimensional graph signals: regional statistical features derived from images obtained via positron emission tomography (PET).

Although the application explored here is brain image classification, the core method is general and applies to any graph-structured data: structured data are treated as signals defined on graphs, transformed using the graph Fourier transform, and classified via a generalized likelihood ratio test (GLRT). Notably, the ideas on graph spectral analysis discussed here are closely related to the foundations of graph deep learning. Early graph convolutional neural networks (CNNs) defined convolution via the eigendecomposition of the graph Laplacian [2], while later architectures such as ChebNet and the graph convolutional network (GCN) introduced efficient polynomial or first-order approximations of spectral graph convolutions [3] [4]. Although I conducted this short project more than ten years ago during my postgraduate studies, this article highlights core concepts that remain relevant to modern machine learning.
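As a rough illustration of the graph Fourier transform mentioned above, the following Python sketch computes it for a toy graph. It rests on several assumptions that the text does not state: the combinatorial Laplacian $L = D - W$ is used (a normalized Laplacian would work similarly), the 4-node weighted path graph and the random 5-dimensional signals are made up for the example, and the function names are hypothetical.

```python
import numpy as np

def laplacian(W):
    """Combinatorial graph Laplacian L = D - W (an assumed choice)."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def gft_basis(W):
    """Eigenvalues (graph frequencies, ascending) and eigenvectors of L."""
    return np.linalg.eigh(laplacian(W))

def gft(U, X):
    """Graph Fourier transform: project each column of X onto the eigenvectors."""
    return U.T @ X

def igft(U, X_hat):
    """Inverse graph Fourier transform."""
    return U @ X_hat

# Toy example: symmetric weight matrix of a 4-node path graph.
W = np.array([[0, 1, 0, 0],
              [1, 0, 2, 0],
              [0, 2, 0, 1],
              [0, 0, 1, 0]], dtype=float)
eigvals, U = gft_basis(W)

# One 5-dimensional signal per node, i.e. a (nodes x features) matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))
X_hat = gft(U, X)        # spectral coefficients, one column per feature
X_rec = igft(U, X_hat)   # recovers X up to floating-point error
```

Because the eigenvectors of a symmetric Laplacian form an orthonormal basis, the transform is invertible and preserves signal energy; each feature column is transformed independently in this sketch.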

2. Data

The data used in this work consist of statistical features extracted from 142 segmented brain FDG-PET images. These images were provided by La Timone Hospital (Marseille) and correspond to 61 healthy control subjects and 81 patients with Alzheimer's disease. Each brain image was segmented into 116 regions of interest (ROIs) using WFU-PickAtlas. For each region, the first four moments (mean, variance, skewness, and kurtosis) and the entropy were computed; these features proved useful for discriminating between the two classes (healthy control and Alzheimer's disease) in prior work by Garali et al. [5]. The resulting data are stored as two tensors, one per class, each of size 116 × 5 × $N$, where $N$ denotes the number of subjects in that class (61 or 81).
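To make the tensor layout concrete, here is a minimal Python sketch of how five such statistics could be computed per region and stacked into a regions × 5 × subjects array. It is not the original extraction pipeline: the function names, the histogram-based Shannon entropy with 32 bins, the population (biased) moment estimators, and the synthetic voxel data are all assumptions made for illustration.

```python
import numpy as np

FEATURES = ("mean", "variance", "skewness", "kurtosis", "entropy")

def region_features(voxels, n_bins=32):
    """Five summary statistics of one region's voxel intensities."""
    v = np.asarray(voxels, dtype=float).ravel()
    mean = v.mean()
    var = v.var()                      # population variance (ddof=0)
    std = np.sqrt(var)
    if std > 0:
        z = (v - mean) / std
        skew = np.mean(z ** 3)         # third standardized moment
        kurt = np.mean(z ** 4)         # fourth standardized moment (non-excess)
    else:
        skew = kurt = 0.0              # convention for a constant region
    # Shannon entropy (in bits) of the intensity histogram (assumed definition).
    counts, _ = np.histogram(v, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    entropy = -np.sum(p * np.log2(p))
    return np.array([mean, var, skew, kurt, entropy])

def build_tensor(subjects):
    """subjects: list (per subject) of lists (per region) of voxel arrays.

    Returns an array of shape (n_regions, 5, n_subjects).
    """
    per_subject = [
        np.stack([region_features(region) for region in regions])
        for regions in subjects
    ]
    return np.stack(per_subject, axis=-1)

# Synthetic stand-in data: 4 subjects, 3 regions, 200 voxels per region.
rng = np.random.default_rng(0)
subjects = [[rng.normal(size=200) for _ in range(3)] for _ in range(4)]
tensor = build_tensor(subjects)
print(tensor.shape)  # (3, 5, 4)
```

With the real data, the same layout would give one 116 × 5 × 61 array for the control group and one 116 × 5 × 81 array for the Alzheimer's group.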

3. Methods

Given the 5-dimensional feature vector $x$...

References

  1. Chenhui Hu, Jorge Sepulcre, Keith A. Johnson, Georges E. Fakhri, Yue M. Lu, and Quanzheng Li, "Matched signal detection on graphs: Theory and application to brain imaging data classification", NeuroImage (2016)
  2. Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun, "Spectral networks and locally connected networks on graphs", arXiv preprint arXiv:1312.6203 (2013)
  3. Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst, "Convolutional neural networks on graphs with fast localized spectral filtering", Advances in neural information processing systems (2016)
  4. Thomas N. Kipf and Max Welling, "Semi-supervised classification with graph convolutional networks", arXiv preprint arXiv:1609.02907 (2016)
  5. Imène Garali, Mouloud Adel, Salah Bourennane, and Eric Guedj, "Histogram-based features selection and volume of interest ranking for brain PET image classification", IEEE journal of translational engineering in health and medicine (2018)
Published on April 24, 2026, last update on April 24, 2026